Problem Statement¶

Business Context¶

Workplace safety in hazardous environments like construction sites and industrial plants is crucial to prevent accidents and injuries. One of the most important safety measures is ensuring workers wear safety helmets, which protect against head injuries from falling objects and machinery. Non-compliance with helmet regulations increases the risk of serious injuries or fatalities, making effective monitoring essential, especially in large-scale operations where manual oversight is prone to errors and inefficiency.

To overcome these challenges, SafeGuard Corp plans to develop an automated image analysis system capable of detecting whether workers are wearing safety helmets. This system will improve safety enforcement, ensuring compliance and reducing the risk of head injuries. By automating helmet monitoring, SafeGuard aims to enhance efficiency, scalability, and accuracy, ultimately fostering a safer work environment while minimizing human error in safety oversight.

Objective¶

As a data scientist at SafeGuard Corp, you are tasked with developing an image classification model that classifies images into one of two categories:

  • With Helmet: Workers wearing safety helmets.
  • Without Helmet: Workers not wearing safety helmets.

Data Description¶

The dataset consists of 631 images, equally divided into two categories:

  • With Helmet: 311 images showing workers wearing helmets.
  • Without Helmet: 320 images showing workers not wearing helmets.

Dataset Characteristics:

  • Variations in Conditions: Images include diverse environments such as construction sites, factories, and industrial settings, with variations in lighting, angles, and worker postures to simulate real-world conditions.
  • Worker Activities: Workers are depicted in different actions such as standing, using tools, or moving, ensuring robust model learning for various scenarios.

Installing and Importing the Necessary Libraries¶

In [33]:
%pip install numpy pandas matplotlib seaborn scikit-learn opencv-python tensorflow keras pillow  -q
Note: you may need to restart the kernel to use updated packages.
In [34]:
import tensorflow as tf
print("Num GPUs Available:", len(tf.config.list_physical_devices('GPU')))
print(tf.__version__)
Num GPUs Available: 0
2.20.0

Note:

  • After running the above cell, kindly restart the notebook kernel (for Jupyter Notebook) or runtime (for Google Colab) and run all cells sequentially from the next cell.

  • On executing the above line of code, you might see a warning regarding package dependencies. This error message can be ignored as the above code ensures that all necessary libraries and their dependencies are maintained to successfully execute the code in this notebook.

In [35]:
import os
import random
import numpy as np                                                                               # Importing numpy for Matrix Operations
import pandas as pd
import seaborn as sns
import matplotlib.image as mpimg                                                                              # Importing pandas to read CSV files
import matplotlib.pyplot as plt                                                                  # Importting matplotlib for Plotting and visualizing images
import math                                                                                      # Importing math module to perform mathematical operations
import cv2


# Tensorflow modules
import keras
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator                              # Importing the ImageDataGenerator for data augmentation
from tensorflow.keras.models import Sequential                                                   # Importing the sequential module to define a sequential model
from tensorflow.keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D,BatchNormalization # Defining all the layers to build our CNN Model
from tensorflow.keras.optimizers import Adam,SGD                                                 # Importing the optimizers which can be used in our model
from sklearn import preprocessing                                                                # Importing the preprocessing module to preprocess the data
from sklearn.model_selection import train_test_split                                             # Importing train_test_split function to split the data into train and test
from sklearn.metrics import confusion_matrix
from tensorflow.keras.models import Model
from keras.applications.vgg16 import VGG16                                               # Importing confusion_matrix to plot the confusion matrix

# Display images using OpenCV
# from google.colab.patches import cv2_imshow

#Imports functions for evaluating the performance of machine learning models
from sklearn.metrics import confusion_matrix, f1_score,accuracy_score, recall_score, precision_score, classification_report
from sklearn.metrics import mean_squared_error as mse                                                 # Importing cv2_imshow from google.patches to display images

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')
In [36]:
# Set the seed using keras.utils.set_random_seed. This will set:
# 1) `numpy` seed
# 2) backend random seed
# 3) `python` random seed
tf.keras.utils.set_random_seed(812)

Data Overview¶

Loading the data¶

In [37]:
images = np.load('images_proj.npy')
labels = pd.read_csv('labels_proj.csv')
In [38]:
print(f'Images shape: {images.shape}')
print(f'Labels shape: {labels.shape}')

print(labels.value_counts())
Images shape: (631, 200, 200, 3)
Labels shape: (631, 1)
Label
0        320
1        311
Name: count, dtype: int64
In [39]:
print("minimum value of the image array is ",np.min(images[0]))
print("maximum value of the image array is",np.max(images[0]))
minimum value of the image array is  0
maximum value of the image array is 255

Observations¶

  • The no of images given to train and test split are less --> 631
  • The image counts can be considered as balanced 320, 311

Exploratory Data Analysis¶

Plot random images from each of the classes and print their corresponding labels.¶

In [40]:
def plot_sample_images(images, labels, num_samples=4):
    with_helmet_indices = labels[labels['Label'] == 1].index.tolist()
    without_helmet_indices = labels[labels['Label'] == 0].index.tolist()
    plt.figure(figsize=(12, 6))
    for i in range(num_samples):
        idx = random.choice(with_helmet_indices)
        plt.subplot(2, num_samples, i + 1)
        plt.imshow(images[idx])
        plt.title('With Helmet')
        plt.axis('off')
    
    # Plot random images without helmet
    for i in range(num_samples):
        idx = random.choice(without_helmet_indices)
        plt.subplot(2, num_samples, num_samples + i + 1)
        plt.imshow(images[idx])
        plt.title('Without Helmet')
        plt.axis('off')
    
    plt.tight_layout()
    plt.show()

# Display random samples of images with and without helmets
plot_sample_images(images,labels)
No description has been provided for this image
Observation¶
  • the data sample looks to be very simple, as without helmet images are mostly closeups to face

did run this sample multiple times and saw that almost all images without helmet are closeups.

Checking for class imbalance¶

In [41]:
# Count the number of images in each class
class_distribution = labels['Label'].value_counts()

# Create a bar plot
plt.figure(figsize=(10, 6))
sns.barplot(x=class_distribution.index, y=class_distribution.values)
plt.title('Distribution of Classes')
plt.xlabel('Class (0: Without Helmet, 1: With Helmet)')
plt.ylabel('Number of Images')

# Add value labels on top of each bar
for i, v in enumerate(class_distribution.values):
    plt.text(i, v, str(v), ha='center', va='bottom')

plt.show()

# Calculate the percentage distribution
percentage_distribution = (class_distribution / len(labels) * 100).round(2)
print("\nPercentage Distribution:")
for class_label, percentage in percentage_distribution.items():
    print(f"Class {class_label}: {percentage}%")
No description has been provided for this image
Percentage Distribution:
Class 0: 50.71%
Class 1: 49.29%
Observations¶
  • As observed earlier, we are seeing that the distribution is fairly even
  • the data sample looks to be very simple, as without helmet images are mostly closeups to face
  • The no of images given to train and test split are less --> 631
  • The image counts can be considered as balanced 320, 311
  • The data need to be normalized to use as values are between 0 and 255

Data Preprocessing¶

Converting images to grayscale¶

In [42]:
# Function to convert RGB images to grayscale
def convert_to_grayscale(images):
    gray_images = []
    for img in images:
        # Convert RGB to grayscale using cv2
        gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
        # Add channel dimension for CNN input (H, W, 1)
        gray_img = gray_img[..., np.newaxis]
        gray_images.append(gray_img)
    return np.array(gray_images)

# Convert images to grayscale
gray_images = convert_to_grayscale(images)

# Display sample images before and after conversion
plt.figure(figsize=(12, 6))
for i in range(3):
    # Original RGB image
    plt.subplot(2, 3, i + 1)
    plt.imshow(images[i])
    plt.title('Original RGB')
    plt.axis('off')
    
    # Grayscale image
    plt.subplot(2, 3, i + 4)
    plt.imshow(gray_images[i].squeeze(), cmap='gray')
    plt.title('Grayscale')
    plt.axis('off')

plt.tight_layout()
plt.show()

print("Original image shape:", images[0].shape)
print("Grayscale image shape:", gray_images[0].shape)
No description has been provided for this image
Original image shape: (200, 200, 3)
Grayscale image shape: (200, 200, 1)
In [43]:
# Apply Gaussian blur to the grayscale images
def apply_gaussian_blur(images, kernel_size=(5,5)):
    blurred_images = []
    for img in images:
        # Apply Gaussian blur
        blurred = cv2.GaussianBlur(img, kernel_size, 0)
        blurred_images.append(blurred)
    return np.array(blurred_images)

# Apply blur to grayscale images
blurred_images = apply_gaussian_blur(gray_images)
In [44]:
# Apply Laplacian edge detection to the grayscale images
def apply_laplacian(images, ksize=3):
    laplacian_images = []
    for img in images:
        # Apply Gaussian blur first to reduce noise
        blurred = cv2.GaussianBlur(img, (5,5), 0)
        # Apply Laplacian
        laplacian = cv2.Laplacian(blurred, cv2.CV_64F, ksize=ksize)
        # Convert back to uint8 and normalize to 0-255 range
        laplacian = np.uint8(np.absolute(laplacian))
        laplacian_images.append(laplacian)
    return np.array(laplacian_images)

# Apply Laplacian edge detection on grayscale images
laplacian_images = apply_laplacian(gray_images)

# Display sample images to compare all preprocessing steps
plt.figure(figsize=(15, 8))
for i in range(3):
    # Original RGB image
    plt.subplot(4, 3, i + 1)
    plt.imshow(images[i])
    plt.title('Original RGB')
    plt.axis('off')
    
    # Grayscale image
    plt.subplot(4, 3, i + 4)
    plt.imshow(gray_images[i].squeeze(), cmap='gray')
    plt.title('Grayscale')
    plt.axis('off')
    
    # Blurred image
    plt.subplot(4, 3, i + 7)
    plt.imshow(blurred_images[i].squeeze(), cmap='gray')
    plt.title('Gaussian Blur')
    plt.axis('off')
    
    # Laplacian image
    plt.subplot(4, 3, i + 10)
    plt.imshow(laplacian_images[i].squeeze(), cmap='gray')
    plt.title('Laplacian Edge Detection')
    plt.axis('off')

plt.tight_layout()
plt.show()

print("Original image shape:", images[0].shape)
print("Grayscale image shape:", gray_images[0].shape)
print("Blurred image shape:", blurred_images[0].shape)
print("Laplacian image shape:", laplacian_images[0].shape)
No description has been provided for this image
Original image shape: (200, 200, 3)
Grayscale image shape: (200, 200, 1)
Blurred image shape: (200, 200)
Laplacian image shape: (200, 200)
preparing images¶
  • we can use gray scale or blurred images if we need to improve the performance fo the model.
  • later stages of the project we can decide to use these images if needed.
  • General observations based on above images, we can see edges and curves in various styles of the image filters

Splitting the dataset¶

In [45]:
# images, gray_images, blurred_images, laplacian_images
# splitting the data into training and testing sets , since there are only 631 samples, we are using 60% for training, 20% for validation and 20% for testing
# splitting rgb images
x_train_rgb, x_temp_rgb, y_train_rgb, y_temp_rgb = train_test_split(images, labels['Label'], test_size=0.4, random_state=42, stratify=labels['Label'])
x_val_rgb, x_test_rgb, y_val_rgb, y_test_rgb = train_test_split(x_temp_rgb, y_temp_rgb, test_size=0.5, random_state=42, stratify=y_temp_rgb)
In [46]:
# splitting grayscale images
x_train_gray, x_temp_gray, y_train_gray, y_temp_gray = train_test_split(gray_images, labels['Label'], test_size=0.4, random_state=42, stratify=labels['Label'])
x_val_gray, x_test_gray, y_val_gray, y_test_gray = train_test_split(x_temp_gray, y_temp_gray, test_size=0.5, random_state=42, stratify=y_temp_gray)
In [47]:
# splitting blurred images
#x_train_blur, x_temp_blur, y_train_blur, y_temp_blur = train_test_split(blurred_images, labels['Label'], test_size=0.4, random_state=42, stratify=labels['Label'])
#x_val_blur, x_test_blur, y_val_blur, y_test_blur = train_test_split(x_temp_blur, y_temp_blur, test_size=0.5, random_state=42, stratify=y_temp_blur)
In [48]:
# splitting laplacian images
#x_train_lap, x_temp_lap, y_train_lap, y_temp_lap = train_test_split(laplacian_images, labels['Label'], test_size=0.4, random_state=42, stratify=labels['Label'])
#x_val_lap, x_test_lap, y_val_lap, y_test_lap = train_test_split(x_temp_lap, y_temp_lap, test_size=0.5, random_state=42, stratify=y_temp_lap)
  • We can also try ANN route to experiment with by using different filters (commented as these are out of scope for this assignment)

Data Normalization¶

In [49]:
# label binarizer is not needed as the labels are already in binary format (0 and 1)
# as we observed earlier in data exploration the values are between 0 and 255, lets normalize them
# normalizing rgb images
x_train_normalized_rgb = x_train_rgb.astype('float32') / 255.0
x_val_normalized_rgb = x_val_rgb.astype('float32') / 255.0
x_test_normalized_rgb = x_test_rgb.astype('float32') / 255.0
In [50]:
# normalizing grayscale images
x_train_normalized_gray = x_train_gray.astype('float32') / 255.0
x_val_normalized_gray = x_val_gray.astype('float32') / 255.0
x_test_normalized_gray = x_test_gray.astype('float32') / 255.0
In [51]:
# normalizing blurred images
# x_train_normalized_blur = x_train_blur.astype('float32') / 255.0
# x_val_normalized_blur = x_val_blur.astype('float32') / 255.0
# x_test_normalized_blur = x_test_blur.astype('float32') / 255.0
In [52]:
# normalizing laplacian images
# x_train_normalized_lap = x_train_lap.astype('float32') / 255.0
# x_val_normalized_lap = x_val_lap.astype('float32') / 255.0
# x_test_normalized_lap = x_test_lap.astype('float32') / 255.0
  • Normalized the data sets to train the models

  • We can also try ANN instead of CNN using the above converted images (out of scope for this assignment)

  • we will build all our models now for RGB .

Model Building¶

Model Evaluation Criterion¶

Utility Functions¶

In [53]:
# defining a function to compute different metrics to check performance of a classification model built using statsmodels
def model_performance_classification(model, predictors, target):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    """

    # checking which probabilities are greater than threshold
    pred = model.predict(predictors).reshape(-1)>0.5

    target = target.to_numpy().reshape(-1)


    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred, average='weighted')  # to compute Recall
    precision = precision_score(target, pred, average='weighted')  # to compute Precision
    f1 = f1_score(target, pred, average='weighted')  # to compute F1-score

    # creating a dataframe of metrics
    df_perf = pd.DataFrame({"Accuracy": acc, "Recall": recall, "Precision": precision, "F1 Score": f1,},index=[0],)

    return df_perf
In [54]:
def plot_confusion_matrix(model,predictors,target,ml=False):
    """
    Function to plot the confusion matrix

    model: classifier
    predictors: independent variables
    target: dependent variable
    ml: To specify if the model used is an sklearn ML model or not (True means ML model)
    """

    # checking which probabilities are greater than threshold
    pred = model.predict(predictors).reshape(-1)>0.5

    target = target.to_numpy().reshape(-1)

    # Plotting the Confusion Matrix using confusion matrix() function which is also predefined tensorflow module
    confusion_matrix = tf.math.confusion_matrix(target,pred)
    f, ax = plt.subplots(figsize=(10, 8))
    sns.heatmap(
        confusion_matrix,
        annot=True,
        linewidths=.4,
        fmt="d",
        square=True,
        ax=ax
    )
    plt.show()

Model 1: Simple Convolutional Neural Network (CNN)¶

In [55]:
# Basic CNN model for helmet detection (using RGB images)
cnn_model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=x_train_rgb.shape[1:]),#200,200,3
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])

cnn_model.compile(optimizer=Adam(learning_rate=0.001),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

cnn_model.summary()
Model: "sequential_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_4 (Conv2D)               │ (None, 198, 198, 32)   │           896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_4 (MaxPooling2D)  │ (None, 99, 99, 32)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_5 (Conv2D)               │ (None, 97, 97, 64)     │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_5 (MaxPooling2D)  │ (None, 48, 48, 64)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_2 (Flatten)             │ (None, 147456)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense)                 │ (None, 64)             │     9,437,248 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_2 (Dropout)             │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_5 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 9,456,705 (36.07 MB)
 Trainable params: 9,456,705 (36.07 MB)
 Non-trainable params: 0 (0.00 B)
In [56]:
# Fit the basic CNN model using normalized RGB images
history_basic_cnn = cnn_model.fit(
    x_train_normalized_rgb, y_train_rgb,
    epochs=20,
    batch_size=32,
    validation_data=(x_val_normalized_rgb, y_val_rgb),
    verbose=2,
    shuffle=True
)
Epoch 1/20
12/12 - 4s - 337ms/step - accuracy: 0.6878 - loss: 1.2283 - val_accuracy: 0.9762 - val_loss: 0.1515
Epoch 2/20
12/12 - 3s - 215ms/step - accuracy: 0.9683 - loss: 0.1333 - val_accuracy: 0.9762 - val_loss: 0.0673
Epoch 3/20
12/12 - 3s - 219ms/step - accuracy: 0.9735 - loss: 0.0921 - val_accuracy: 0.9762 - val_loss: 0.0603
Epoch 4/20
12/12 - 3s - 222ms/step - accuracy: 0.9947 - loss: 0.0324 - val_accuracy: 0.9762 - val_loss: 0.0703
Epoch 5/20
12/12 - 3s - 229ms/step - accuracy: 0.9894 - loss: 0.0352 - val_accuracy: 1.0000 - val_loss: 0.0074
Epoch 6/20
12/12 - 3s - 213ms/step - accuracy: 0.9815 - loss: 0.0404 - val_accuracy: 0.9841 - val_loss: 0.0336
Epoch 7/20
12/12 - 2s - 200ms/step - accuracy: 0.9921 - loss: 0.0204 - val_accuracy: 1.0000 - val_loss: 0.0109
Epoch 8/20
12/12 - 3s - 212ms/step - accuracy: 0.9947 - loss: 0.0162 - val_accuracy: 0.9762 - val_loss: 0.0961
Epoch 9/20
12/12 - 2s - 198ms/step - accuracy: 0.9947 - loss: 0.0195 - val_accuracy: 0.9683 - val_loss: 0.0947
Epoch 10/20
12/12 - 6s - 478ms/step - accuracy: 0.9974 - loss: 0.0102 - val_accuracy: 1.0000 - val_loss: 0.0090
Epoch 11/20
12/12 - 4s - 326ms/step - accuracy: 0.9735 - loss: 0.0728 - val_accuracy: 0.9841 - val_loss: 0.0430
Epoch 12/20
12/12 - 3s - 255ms/step - accuracy: 0.9921 - loss: 0.0311 - val_accuracy: 1.0000 - val_loss: 0.0100
Epoch 13/20
12/12 - 6s - 541ms/step - accuracy: 0.9974 - loss: 0.0224 - val_accuracy: 0.9762 - val_loss: 0.0851
Epoch 14/20
12/12 - 3s - 261ms/step - accuracy: 0.9974 - loss: 0.0153 - val_accuracy: 0.9762 - val_loss: 0.0732
Epoch 15/20
12/12 - 2s - 202ms/step - accuracy: 1.0000 - loss: 0.0067 - val_accuracy: 1.0000 - val_loss: 0.0042
Epoch 16/20
12/12 - 3s - 211ms/step - accuracy: 1.0000 - loss: 0.0063 - val_accuracy: 1.0000 - val_loss: 0.0027
Epoch 17/20
12/12 - 2s - 199ms/step - accuracy: 1.0000 - loss: 0.0014 - val_accuracy: 1.0000 - val_loss: 0.0029
Epoch 18/20
12/12 - 2s - 201ms/step - accuracy: 1.0000 - loss: 0.0028 - val_accuracy: 1.0000 - val_loss: 0.0036
Epoch 19/20
12/12 - 2s - 202ms/step - accuracy: 1.0000 - loss: 0.0030 - val_accuracy: 1.0000 - val_loss: 0.0015
Epoch 20/20
12/12 - 2s - 202ms/step - accuracy: 1.0000 - loss: 0.0024 - val_accuracy: 0.9921 - val_loss: 0.0094
In [57]:
def plot_training_history(history):
    """
    Function to plot training and validation accuracy and loss
    history: History object returned by model.fit()
    """
    # Plot training and validation accuracy and loss
    plt.figure(figsize=(14, 5))
    plt.subplot(1, 2, 1)
    plt.plot(history.history['accuracy'], label='Train Accuracy')
    plt.plot(history.history['val_accuracy'], label='Val Accuracy')
    plt.title('Model Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()

    plt.subplot(1, 2, 2)
    plt.plot(history.history['loss'], label='Train Loss')
    plt.plot(history.history['val_loss'], label='Val Loss')
    plt.title('Model Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()

    plt.tight_layout()
    plt.show()

plot_training_history(history_basic_cnn)
No description has been provided for this image
In [58]:
performance_test_basic_cnn = model_performance_classification(cnn_model, x_test_normalized_rgb, y_test_rgb)
print("Performance of Basic CNN Model on Test set of RGB Images:")
print(performance_test_basic_cnn)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step
Performance of Basic CNN Model on Test set of RGB Images:
   Accuracy    Recall  Precision  F1 Score
0  0.992126  0.992126   0.992249  0.992126
In [59]:
# Plot confusion matrix for test set predictions
plot_confusion_matrix(cnn_model, x_test_normalized_rgb, y_test_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
No description has been provided for this image
In [60]:
performance_val_basic_cnn = model_performance_classification(cnn_model, x_val_normalized_rgb, y_val_rgb)
print("Performance of Basic CNN Model on val set of RGB Images:")
print(performance_val_basic_cnn)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
Performance of Basic CNN Model on val set of RGB Images:
   Accuracy    Recall  Precision  F1 Score
0  0.992063  0.992063   0.992186  0.992062
In [61]:
# Plot confusion matrix for val set predictions
plot_confusion_matrix(cnn_model, x_val_normalized_rgb, y_val_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 43ms/step
No description has been provided for this image
In [62]:
performance_train_basic_cnn = model_performance_classification(cnn_model, x_train_normalized_rgb, y_train_rgb)
print("Performance of Basic CNN Model on Training set of RGB Images:")
print(performance_train_basic_cnn)
12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 43ms/step
Performance of Basic CNN Model on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [63]:
# Plot confusion matrix for val set predictions
plot_confusion_matrix(cnn_model, x_train_normalized_rgb, y_train_rgb, ml=True)
12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 44ms/step
No description has been provided for this image

Vizualizing the predictions¶

In [64]:
def plot_sample_predictions_on_val_set(model):   
     # Visualize predictions on val images
    num_samples = 8
    plt.figure(figsize=(16, 8))
    pred_probs = model.predict(x_val_normalized_rgb[:num_samples])
    pred_labels = (pred_probs > 0.5).astype(int).reshape(-1)
    for i in range(num_samples):
        plt.subplot(2, num_samples//2, i+1)
        plt.imshow(x_val_rgb[i])
        true_label = y_val_rgb.iloc[i] if hasattr(y_val_rgb, 'iloc') else y_val_rgb[i]
        plt.title(f"True: {'Helmet' if true_label==1 else 'No Helmet'}\nPred: {'Helmet' if pred_labels[i]==1 else 'No Helmet'}")
        plt.axis('off')
    plt.tight_layout()
    plt.show()

def plot_sample_predictions_on_test_set(model):   
     # Visualize predictions on val images
    num_samples = 8
    plt.figure(figsize=(16, 8))
    pred_probs = model.predict(x_test_normalized_rgb[:num_samples])
    pred_labels = (pred_probs > 0.5).astype(int).reshape(-1)
    for i in range(num_samples):
        plt.subplot(2, num_samples//2, i+1)
        plt.imshow(x_test_rgb[i])
        true_label = y_test_rgb.iloc[i] if hasattr(y_test_rgb, 'iloc') else y_test_rgb[i]
        plt.title(f"True: {'Helmet' if true_label==1 else 'No Helmet'}\nPred: {'Helmet' if pred_labels[i]==1 else 'No Helmet'}")
        plt.axis('off')
    plt.tight_layout()
    plt.show()

plot_sample_predictions_on_val_set(cnn_model)

plot_sample_predictions_on_test_set(cnn_model)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step
No description has been provided for this image
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step
No description has been provided for this image
  • A simple CNN peformed very well with the given data
  • lets try with a pretrained CNN VGG16 model to see how it works
In [65]:
# lets try the same model with grayscale images
# Basic CNN model for helmet detection (using grayscale images)

cnn_model_gray = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=x_train_gray.shape[1:]),#200,200,1
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])
cnn_model_gray.compile(optimizer=Adam(learning_rate=0.001),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
cnn_model_gray.summary()
Model: "sequential_3"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_6 (Conv2D)               │ (None, 198, 198, 32)   │           320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_6 (MaxPooling2D)  │ (None, 99, 99, 32)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_7 (Conv2D)               │ (None, 97, 97, 64)     │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_7 (MaxPooling2D)  │ (None, 48, 48, 64)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_3 (Flatten)             │ (None, 147456)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_6 (Dense)                 │ (None, 64)             │     9,437,248 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_3 (Dropout)             │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_7 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 9,456,129 (36.07 MB)
 Trainable params: 9,456,129 (36.07 MB)
 Non-trainable params: 0 (0.00 B)
In [66]:
# Fit the basic CNN model using normalized RGB images
history_basic_cnn_gray = cnn_model_gray.fit(
    x_train_normalized_gray, y_train_gray,
    epochs=20,
    batch_size=32,
    validation_data=(x_val_normalized_gray, y_val_gray),
    verbose=2,
    shuffle=True
)
Epoch 1/20
12/12 - 3s - 272ms/step - accuracy: 0.6402 - loss: 0.7202 - val_accuracy: 0.9444 - val_loss: 0.3066
Epoch 2/20
12/12 - 2s - 187ms/step - accuracy: 0.9339 - loss: 0.1733 - val_accuracy: 0.9762 - val_loss: 0.0882
Epoch 3/20
12/12 - 2s - 186ms/step - accuracy: 0.9868 - loss: 0.0583 - val_accuracy: 0.9921 - val_loss: 0.0166
Epoch 4/20
12/12 - 2s - 187ms/step - accuracy: 0.9894 - loss: 0.0338 - val_accuracy: 0.9841 - val_loss: 0.0312
Epoch 5/20
12/12 - 2s - 189ms/step - accuracy: 1.0000 - loss: 0.0158 - val_accuracy: 0.9841 - val_loss: 0.0533
Epoch 6/20
12/12 - 2s - 187ms/step - accuracy: 0.9974 - loss: 0.0136 - val_accuracy: 0.9921 - val_loss: 0.0075
Epoch 7/20
12/12 - 2s - 185ms/step - accuracy: 0.9974 - loss: 0.0116 - val_accuracy: 0.9921 - val_loss: 0.0193
Epoch 8/20
12/12 - 2s - 191ms/step - accuracy: 1.0000 - loss: 0.0067 - val_accuracy: 0.9841 - val_loss: 0.0533
Epoch 9/20
12/12 - 2s - 187ms/step - accuracy: 0.9974 - loss: 0.0100 - val_accuracy: 1.0000 - val_loss: 0.0022
Epoch 10/20
12/12 - 2s - 195ms/step - accuracy: 1.0000 - loss: 0.0035 - val_accuracy: 0.9921 - val_loss: 0.0294
Epoch 11/20
12/12 - 2s - 184ms/step - accuracy: 0.9974 - loss: 0.0090 - val_accuracy: 1.0000 - val_loss: 0.0021
Epoch 12/20
12/12 - 2s - 187ms/step - accuracy: 1.0000 - loss: 0.0033 - val_accuracy: 1.0000 - val_loss: 0.0038
Epoch 13/20
12/12 - 2s - 184ms/step - accuracy: 1.0000 - loss: 0.0018 - val_accuracy: 0.9921 - val_loss: 0.0174
Epoch 14/20
12/12 - 2s - 183ms/step - accuracy: 0.9974 - loss: 0.0030 - val_accuracy: 1.0000 - val_loss: 0.0011
Epoch 15/20
12/12 - 2s - 182ms/step - accuracy: 1.0000 - loss: 5.5501e-04 - val_accuracy: 0.9921 - val_loss: 0.0088
Epoch 16/20
12/12 - 2s - 187ms/step - accuracy: 1.0000 - loss: 4.6206e-04 - val_accuracy: 0.9921 - val_loss: 0.0160
Epoch 17/20
12/12 - 2s - 184ms/step - accuracy: 1.0000 - loss: 2.8315e-04 - val_accuracy: 0.9921 - val_loss: 0.0183
Epoch 18/20
12/12 - 2s - 194ms/step - accuracy: 1.0000 - loss: 7.0238e-04 - val_accuracy: 0.9921 - val_loss: 0.0068
Epoch 19/20
12/12 - 2s - 184ms/step - accuracy: 1.0000 - loss: 3.7387e-04 - val_accuracy: 0.9921 - val_loss: 0.0083
Epoch 20/20
12/12 - 2s - 196ms/step - accuracy: 0.9974 - loss: 0.0053 - val_accuracy: 0.9921 - val_loss: 0.0375
In [67]:
plot_training_history(history_basic_cnn_gray)
No description has been provided for this image
In [68]:
performance_test_basic_cnn_gray = model_performance_classification(cnn_model_gray, x_test_normalized_gray, y_test_gray)
print("Performance of Basic CNN Model on Test set of Grayscale Images:")
print(performance_test_basic_cnn_gray)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 51ms/step
Performance of Basic CNN Model on Test set of Grayscale Images:
   Accuracy    Recall  Precision  F1 Score
0  0.968504  0.968504   0.968962  0.968492
In [69]:
# Plot confusion matrix for test set predictions
plot_confusion_matrix(cnn_model_gray, x_test_normalized_gray, y_test_gray, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step
No description has been provided for this image
  • The gray scale images also performed very well with the basic cnn model.
  • we can continue to test the basic cnn or create an ANN with the gray scale or blurred or lap filters
  • but for this exericse lets continue with the vgg-16 models and see how the perforamnce is

Model 2: (VGG-16 (Base))¶

In [70]:
# VGG16-based model for helmet detection (using RGB images)
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import GlobalAveragePooling2D
from tensorflow.keras.models import Model

# Load VGG16 base (without top, with imagenet weights)
vgg_base = VGG16(weights='imagenet', include_top=False, input_shape=x_train_rgb.shape[1:])

# Making all the layers of the VGG model non-trainable. i.e. freezing them
for layer in vgg_base.layers:
    layer.trainable = False

vgg_base.summary()

vgg16_model = Sequential() # Initializing the Sequential model
vgg16_model.add(vgg_base) # Adding the VGG16 base model
vgg16_model.add(Flatten())# Flattening the output of the VGG16 model
vgg16_model.add(Dense(1, activation='sigmoid'))

vgg16_model.compile(optimizer=Adam(learning_rate=0.001),
                    loss='binary_crossentropy',
                    metrics=['accuracy'])

vgg16_model.summary()
Model: "vgg16"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ input_layer_4 (InputLayer)      │ (None, 200, 200, 3)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block1_conv1 (Conv2D)           │ (None, 200, 200, 64)   │         1,792 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block1_conv2 (Conv2D)           │ (None, 200, 200, 64)   │        36,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block1_pool (MaxPooling2D)      │ (None, 100, 100, 64)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block2_conv1 (Conv2D)           │ (None, 100, 100, 128)  │        73,856 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block2_conv2 (Conv2D)           │ (None, 100, 100, 128)  │       147,584 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block2_pool (MaxPooling2D)      │ (None, 50, 50, 128)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block3_conv1 (Conv2D)           │ (None, 50, 50, 256)    │       295,168 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block3_conv2 (Conv2D)           │ (None, 50, 50, 256)    │       590,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block3_conv3 (Conv2D)           │ (None, 50, 50, 256)    │       590,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block3_pool (MaxPooling2D)      │ (None, 25, 25, 256)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block4_conv1 (Conv2D)           │ (None, 25, 25, 512)    │     1,180,160 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block4_conv2 (Conv2D)           │ (None, 25, 25, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block4_conv3 (Conv2D)           │ (None, 25, 25, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block4_pool (MaxPooling2D)      │ (None, 12, 12, 512)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block5_conv1 (Conv2D)           │ (None, 12, 12, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block5_conv2 (Conv2D)           │ (None, 12, 12, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block5_conv3 (Conv2D)           │ (None, 12, 12, 512)    │     2,359,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ block5_pool (MaxPooling2D)      │ (None, 6, 6, 512)      │             0 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 14,714,688 (56.13 MB)
 Trainable params: 0 (0.00 B)
 Non-trainable params: 14,714,688 (56.13 MB)
Model: "sequential_4"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ vgg16 (Functional)              │ (None, 6, 6, 512)      │    14,714,688 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_4 (Flatten)             │ (None, 18432)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_8 (Dense)                 │ (None, 1)              │        18,433 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 14,733,121 (56.20 MB)
 Trainable params: 18,433 (72.00 KB)
 Non-trainable params: 14,714,688 (56.13 MB)
In [71]:
trainDataGen = ImageDataGenerator()
In [72]:
# Fit the VGG16 model using normalized RGB images
history_vgg16 = vgg16_model.fit(trainDataGen.flow(x_train_normalized_rgb, y_train_rgb, batch_size=32,shuffle=False),
    epochs=20,
    validation_data=(x_val_normalized_rgb, y_val_rgb),
    verbose=2
)
Epoch 1/20
12/12 - 27s - 2s/step - accuracy: 0.8995 - loss: 0.1998 - val_accuracy: 1.0000 - val_loss: 0.0213
Epoch 2/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 0.0081 - val_accuracy: 1.0000 - val_loss: 0.0064
Epoch 3/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 0.0029 - val_accuracy: 1.0000 - val_loss: 0.0046
Epoch 4/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 0.0020 - val_accuracy: 1.0000 - val_loss: 0.0042
Epoch 5/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 0.0016 - val_accuracy: 1.0000 - val_loss: 0.0039
Epoch 6/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 0.0014 - val_accuracy: 1.0000 - val_loss: 0.0038
Epoch 7/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 0.0013 - val_accuracy: 1.0000 - val_loss: 0.0035
Epoch 8/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 0.0012 - val_accuracy: 1.0000 - val_loss: 0.0034
Epoch 9/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 0.0011 - val_accuracy: 1.0000 - val_loss: 0.0033
Epoch 10/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 0.0010 - val_accuracy: 1.0000 - val_loss: 0.0032
Epoch 11/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 9.3872e-04 - val_accuracy: 1.0000 - val_loss: 0.0031
Epoch 12/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 8.6986e-04 - val_accuracy: 1.0000 - val_loss: 0.0030
Epoch 13/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 8.1396e-04 - val_accuracy: 1.0000 - val_loss: 0.0029
Epoch 14/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 7.6172e-04 - val_accuracy: 1.0000 - val_loss: 0.0028
Epoch 15/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 7.1410e-04 - val_accuracy: 1.0000 - val_loss: 0.0027
Epoch 16/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 6.7181e-04 - val_accuracy: 1.0000 - val_loss: 0.0026
Epoch 17/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 6.3435e-04 - val_accuracy: 1.0000 - val_loss: 0.0026
Epoch 18/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 5.9627e-04 - val_accuracy: 1.0000 - val_loss: 0.0025
Epoch 19/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 5.6484e-04 - val_accuracy: 1.0000 - val_loss: 0.0024
Epoch 20/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 5.3532e-04 - val_accuracy: 1.0000 - val_loss: 0.0023
In [73]:
plot_training_history(history_vgg16)
No description has been provided for this image
In [74]:
performance_train_basic_vgg = model_performance_classification(vgg16_model, x_train_normalized_rgb, y_train_rgb)
print("Performance of Basic VGG Model on Training set of RGB Images:")
print(performance_train_basic_vgg)
12/12 ━━━━━━━━━━━━━━━━━━━━ 20s 2s/step
Performance of Basic VGG Model on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [75]:
plot_confusion_matrix(vgg16_model, x_train_normalized_rgb, y_train_rgb, ml=True)
12/12 ━━━━━━━━━━━━━━━━━━━━ 19s 2s/step
No description has been provided for this image
In [76]:
performance_val_basic_vgg = model_performance_classification(vgg16_model, x_val_normalized_rgb, y_val_rgb)
print("Performance of Basic CNN vgg16 Model on Val set of RGB Images:")
print(performance_val_basic_vgg)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
Performance of Basic CNN vgg16 Model on Val set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [77]:
plot_confusion_matrix(vgg16_model, x_val_normalized_rgb, y_val_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 2s/step
No description has been provided for this image
In [78]:
# peroformance classification on test set
performance_test_basic_vgg = model_performance_classification(vgg16_model, x_test_normalized_rgb, y_test_rgb)
print("Performance of Basic CNN vgg16 Model on Test set of RGB Images:")
print(performance_test_basic_vgg)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
Performance of Basic CNN vgg16 Model on Test set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [79]:
# Plot confusion matrix for test set predictions
plot_confusion_matrix(vgg16_model, x_test_normalized_rgb, y_test_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
No description has been provided for this image

Visualizing the prediction:¶

In [80]:
print("sample visulization of predictions on Validation set ")

plot_sample_predictions_on_val_set(vgg16_model)
print("sample visulization of predictions on test set ")
plot_sample_predictions_on_test_set(vgg16_model)
sample visulization of predictions on Validation set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 413ms/step
No description has been provided for this image
sample visulization of predictions on test set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 411ms/step
No description has been provided for this image

Observations from VGG16 Model Results¶

  • The VGG16 transfer learning model achieved strong accuracy on both validation and test sets, indicating good generalization.

  • Training and validation accuracy curves show minimal overfitting, likely due to frozen base layers and regularization (Dropout).

  • The confusion matrix reveals that the model distinguishes well between 'Helmet' and 'No Helmet' classes.

  • VGG and basic CNN performed almost same for this usecase. VGG being pretrained can still be good choice to be ready for future scenarios

  • Further improvements could be made with adding fully connected layers, data augmentation, fine-tuning, or experimenting with other architectures.

Model 3: (VGG-16 (Base + FFNN))¶

In [81]:
# lets create a feed forward neural network using the extracted features from VGG16 model
vgg16_ffnn_model = Sequential()
vgg16_ffnn_model.add(vgg_base) # Adding the VGG16 base model
vgg16_ffnn_model.add(Flatten())# Flattening the output of the VGG 

# Adding fully connected layers
vgg16_ffnn_model.add(Dense(128, activation='relu'))
vgg16_ffnn_model.add(Dropout(0.5))
vgg16_ffnn_model.add(Dense(64, activation='relu'))

vgg16_ffnn_model.add(Dense(1, activation='sigmoid'))
In [82]:
vgg16_ffnn_model.compile(optimizer=Adam(learning_rate=0.001),
                    loss='binary_crossentropy',
                    metrics=['accuracy'])
vgg16_ffnn_model.summary()
Model: "sequential_5"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ vgg16 (Functional)              │ (None, 6, 6, 512)      │    14,714,688 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_5 (Flatten)             │ (None, 18432)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_9 (Dense)                 │ (None, 128)            │     2,359,424 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_4 (Dropout)             │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_10 (Dense)                │ (None, 64)             │         8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_11 (Dense)                │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 17,082,433 (65.16 MB)
 Trainable params: 2,367,745 (9.03 MB)
 Non-trainable params: 14,714,688 (56.13 MB)
In [83]:
# Fit the vgg16_ffnn_model  using normalized RGB images
history_vgg16_ffnn = vgg16_ffnn_model.fit(trainDataGen.flow(x_train_normalized_rgb, y_train_rgb, batch_size=32,shuffle=False),
    epochs=20,
    validation_data=(x_val_normalized_rgb, y_val_rgb),
    verbose=2
)
# re using the trainDataGen defined earlier for default data augmentation
Epoch 1/20
12/12 - 26s - 2s/step - accuracy: 0.8095 - loss: 0.4135 - val_accuracy: 0.9921 - val_loss: 0.0140
Epoch 2/20
12/12 - 25s - 2s/step - accuracy: 0.9974 - loss: 0.0123 - val_accuracy: 1.0000 - val_loss: 6.4031e-04
Epoch 3/20
12/12 - 25s - 2s/step - accuracy: 0.9974 - loss: 0.0073 - val_accuracy: 1.0000 - val_loss: 7.5962e-04
Epoch 4/20
12/12 - 25s - 2s/step - accuracy: 0.9974 - loss: 0.0046 - val_accuracy: 1.0000 - val_loss: 2.8602e-04
Epoch 5/20
12/12 - 25s - 2s/step - accuracy: 0.9974 - loss: 0.0052 - val_accuracy: 1.0000 - val_loss: 2.2116e-04
Epoch 6/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 0.0011 - val_accuracy: 1.0000 - val_loss: 6.7130e-04
Epoch 7/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 0.0012 - val_accuracy: 1.0000 - val_loss: 3.5986e-04
Epoch 8/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 8.2859e-04 - val_accuracy: 1.0000 - val_loss: 6.7191e-04
Epoch 9/20
12/12 - 25s - 2s/step - accuracy: 0.9974 - loss: 0.0030 - val_accuracy: 1.0000 - val_loss: 5.1351e-04
Epoch 10/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 8.2787e-04 - val_accuracy: 1.0000 - val_loss: 1.7627e-04
Epoch 11/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 2.4610e-04 - val_accuracy: 1.0000 - val_loss: 3.7951e-04
Epoch 12/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 2.8586e-04 - val_accuracy: 1.0000 - val_loss: 3.0774e-04
Epoch 13/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 1.6748e-04 - val_accuracy: 1.0000 - val_loss: 2.2281e-04
Epoch 14/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 3.2890e-04 - val_accuracy: 1.0000 - val_loss: 2.7520e-04
Epoch 15/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 1.3805e-04 - val_accuracy: 1.0000 - val_loss: 3.6553e-04
Epoch 16/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 4.5668e-04 - val_accuracy: 1.0000 - val_loss: 2.8049e-04
Epoch 17/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 5.4425e-04 - val_accuracy: 1.0000 - val_loss: 1.5664e-04
Epoch 18/20
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 0.0017 - val_accuracy: 1.0000 - val_loss: 0.0019
Epoch 19/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 0.0011 - val_accuracy: 1.0000 - val_loss: 0.0034
Epoch 20/20
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 2.2428e-04 - val_accuracy: 1.0000 - val_loss: 8.8601e-04
In [84]:
plot_training_history(history_vgg16_ffnn)
No description has been provided for this image
In [85]:
# performance classification on validation set
performance_val_vgg16_ffnn = model_performance_classification(vgg16_ffnn_model, x_val_normalized_rgb, y_val_rgb)
print("Performance of VGG16 FFNN Model on Val set of RGB Images:")
print(performance_val_vgg16_ffnn)
4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 2s/step
Performance of VGG16 FFNN Model on Val set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [86]:
# confusion matrix for validation set
plot_confusion_matrix(vgg16_ffnn_model, x_val_normalized_rgb, y_val_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 2s/step
No description has been provided for this image
In [87]:
# performance classification on test set
performance_test_vgg16_ffnn = model_performance_classification(vgg16_ffnn_model, x_test_normalized_rgb, y_test_rgb)
print("Performance of VGG16 FFNN Model on Test set of RGB Images:")
print(performance_test_vgg16_ffnn)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
Performance of VGG16 FFNN Model on Test set of RGB Images:
   Accuracy    Recall  Precision  F1 Score
0  0.992126  0.992126   0.992249  0.992126
In [88]:
# confusion matrix for test set
plot_confusion_matrix(vgg16_ffnn_model, x_test_normalized_rgb, y_test_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
No description has been provided for this image
In [89]:
performance_train_vgg16_ffnn = model_performance_classification(vgg16_ffnn_model, x_train_normalized_rgb, y_train_rgb)
print("Performance of Basic VGG Model with FFNN on Training set of RGB Images:")
print(performance_train_vgg16_ffnn)
12/12 ━━━━━━━━━━━━━━━━━━━━ 19s 2s/step
Performance of Basic VGG Model with FFNN on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [90]:
# confusion matrix for train set
plot_confusion_matrix(vgg16_ffnn_model, x_train_normalized_rgb, y_train_rgb, ml=True)
12/12 ━━━━━━━━━━━━━━━━━━━━ 19s 2s/step
No description has been provided for this image

Visualizing the predictions¶

In [91]:
print("sample visulization of predictions on Validation set ")

plot_sample_predictions_on_val_set(vgg16_ffnn_model)
print("sample visulization of predictions on test set ")
plot_sample_predictions_on_test_set(vgg16_ffnn_model)
sample visulization of predictions on Validation set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 422ms/step
No description has been provided for this image
sample visulization of predictions on test set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 411ms/step
No description has been provided for this image

Observations¶

  • VGG16-Base + FFNN performed well
  • This model performance is same as base model, perhaps may be because the size of the data .
  • These pretrain models are learning fast as the pictures are clearly distinguishable.

Model 4: (VGG-16 (Base + FFNN + Data Augmentation)¶

  • In most of the real-world case studies, it is challenging to acquire a large number of images and then train CNNs.

  • To overcome this problem, one approach we might consider is Data Augmentation.

  • CNNs have the property of translational invariance, which means they can recognise an object even if its appearance shifts translationally in some way. - Taking this attribute into account, we can augment the images using the techniques listed below

    • Horizontal Flip (should be set to True/False)
    • Vertical Flip (should be set to True/False)
    • Height Shift (should be between 0 and 1)
    • Width Shift (should be between 0 and 1)
    • Rotation (should be between 0 and 180)
    • Shear (should be between 0 and 1)
    • Zoom (should be between 0 and 1) etc.

Remember, data augmentation should not be used in the validation/test data set.

In [92]:
# lets create a feed forward neural network using the extracted features from VGG16 model (same as model-3 but for clarity i am defining again)
vgg16_ffnn_da_model = Sequential()
vgg16_ffnn_da_model.add(vgg_base) # Adding the VGG16 base model
vgg16_ffnn_da_model.add(Flatten())# Flattening the output of the VGG 

# Adding fully connected layers
vgg16_ffnn_da_model.add(Dense(128, activation='relu'))
vgg16_ffnn_da_model.add(Dropout(0.5))
vgg16_ffnn_da_model.add(Dense(64, activation='relu'))

vgg16_ffnn_da_model.add(Dense(1, activation='sigmoid'))

# the idea here is we will use data augmentation to train the model
In [93]:
vgg16_ffnn_da_model.compile(optimizer=Adam(learning_rate=0.001),
                    loss='binary_crossentropy',
                    metrics=['accuracy'])
vgg16_ffnn_da_model.summary()
Model: "sequential_6"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ vgg16 (Functional)              │ (None, 6, 6, 512)      │    14,714,688 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_6 (Flatten)             │ (None, 18432)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_12 (Dense)                │ (None, 128)            │     2,359,424 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_5 (Dropout)             │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_13 (Dense)                │ (None, 64)             │         8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_14 (Dense)                │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 17,082,433 (65.16 MB)
 Trainable params: 2,367,745 (9.03 MB)
 Non-trainable params: 14,714,688 (56.13 MB)
In [ ]:
from tensorflow.keras.callbacks import EarlyStopping

# since we are going to use data augumentation we will define a new ImageDataGenerator with some data augmentation techniques
dataGenAugmented = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.15,
    zoom_range=0.15,
    horizontal_flip=True,
    fill_mode='nearest'
)

# also lets define a early stopping callback to prevent overfitting

early_stopping = EarlyStopping(
    monitor='val_loss',  # Monitor validation loss
    patience=5,         # Stop if no improvement for 5 epochs
    mode='min',          # Minimize the validation loss
    verbose=1,
    restore_best_weights=True  # Restore best weights found during training
)
In [96]:
# Fit the vgg16_ffnn_da_model  using normalized RGB images
history_vgg16_ffnn_da_model = vgg16_ffnn_da_model.fit(trainDataGen.flow(x_train_normalized_rgb, y_train_rgb, batch_size=32,shuffle=False),
    epochs=200, # increased epochs since we have early stopping
    callbacks=[early_stopping],
    validation_data=(x_val_normalized_rgb, y_val_rgb),
    verbose=2
)
Epoch 1/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 1.1077e-04 - val_accuracy: 1.0000 - val_loss: 1.4701e-06
Epoch 2/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 2.4386e-05 - val_accuracy: 1.0000 - val_loss: 1.3341e-06
Epoch 3/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 1.7757e-05 - val_accuracy: 1.0000 - val_loss: 1.2750e-06
Epoch 4/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 3.4366e-05 - val_accuracy: 1.0000 - val_loss: 1.2407e-06
Epoch 5/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 1.1623e-05 - val_accuracy: 1.0000 - val_loss: 1.2103e-06
Epoch 6/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 3.0353e-05 - val_accuracy: 1.0000 - val_loss: 1.1957e-06
Epoch 7/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 6.9421e-05 - val_accuracy: 1.0000 - val_loss: 1.2020e-06
Epoch 8/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 6.2499e-05 - val_accuracy: 1.0000 - val_loss: 1.2558e-06
Epoch 9/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 8.4202e-05 - val_accuracy: 1.0000 - val_loss: 1.1266e-06
Epoch 10/200
12/12 - 26s - 2s/step - accuracy: 1.0000 - loss: 9.4916e-05 - val_accuracy: 1.0000 - val_loss: 2.6783e-06
Epoch 11/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 8.8229e-06 - val_accuracy: 1.0000 - val_loss: 5.2944e-06
Epoch 12/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 9.4572e-06 - val_accuracy: 1.0000 - val_loss: 5.8613e-06
Epoch 13/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 0.0011 - val_accuracy: 1.0000 - val_loss: 2.2558e-06
Epoch 14/200
12/12 - 25s - 2s/step - accuracy: 1.0000 - loss: 3.2293e-06 - val_accuracy: 1.0000 - val_loss: 1.5033e-06
Epoch 14: early stopping
Restoring model weights from the end of the best epoch: 9.
In [97]:
plot_training_history(history_vgg16_ffnn_da_model)
No description has been provided for this image

observation¶

  • using early stopping helped to stop the epochs at 14
  • the model is now using best weights from 9th iteration
  • the model loss graph clearly shows this
In [98]:
# performance classification on validation set
performance_val_vgg16_ffnn_da = model_performance_classification(vgg16_ffnn_da_model, x_val_normalized_rgb, y_val_rgb)
print("Performance of VGG16 FFNN Model with data augumentation on Val set of RGB Images:")
print(performance_val_vgg16_ffnn_da)
4/4 ━━━━━━━━━━━━━━━━━━━━ 7s 2s/step
Performance of VGG16 FFNN Model with data augumentation on Val set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [99]:
# confusion matrix for validation set
plot_confusion_matrix(vgg16_ffnn_da_model, x_val_normalized_rgb, y_val_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
No description has been provided for this image
In [100]:
# performance classification on Test set
performance_test_vgg16_ffnn_da = model_performance_classification(vgg16_ffnn_da_model, x_test_normalized_rgb, y_test_rgb)
print("Performance of VGG16 FFNN Model with data augumentation on Test set of RGB Images:")
print(performance_test_vgg16_ffnn_da)
4/4 ━━━━━━━━━━━━━━━━━━━━ 6s 2s/step
Performance of VGG16 FFNN Model with data augumentation on Test set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [101]:
#confusion matrix for test set
plot_confusion_matrix(vgg16_ffnn_da_model, x_test_normalized_rgb, y_test_rgb, ml=True)
4/4 ━━━━━━━━━━━━━━━━━━━━ 11s 3s/step
No description has been provided for this image
In [102]:
performance_train_vgg16_ffnn_da = model_performance_classification(vgg16_ffnn_da_model, x_train_normalized_rgb, y_train_rgb)
print("Performance of Basic VGG Model with FFNN and Data Augumentation on Training set of RGB Images:")
print(performance_train_vgg16_ffnn_da)
12/12 ━━━━━━━━━━━━━━━━━━━━ 30s 2s/step
Performance of Basic VGG Model with FFNN and Data Augumentation on Training set of RGB Images:
   Accuracy  Recall  Precision  F1 Score
0       1.0     1.0        1.0       1.0
In [103]:
plot_confusion_matrix(vgg16_ffnn_da_model, x_train_normalized_rgb, y_train_rgb, ml=True)
12/12 ━━━━━━━━━━━━━━━━━━━━ 24s 2s/step
No description has been provided for this image

Visualizing the predictions¶

In [104]:
print("sample visulization of predictions on Validation set ")

plot_sample_predictions_on_val_set(vgg16_ffnn_da_model)
print("sample visulization of predictions on test set ")
plot_sample_predictions_on_test_set(vgg16_ffnn_da_model)
sample visulization of predictions on Validation set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 820ms/step
No description has been provided for this image
sample visulization of predictions on test set 
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 538ms/step
No description has been provided for this image
Observations¶
  • This Model also performed well and is on par with previous models in performance
  • The training was faster because of the usage of early stopping we provided , even though epcohs are high (stopped at 14 epochs)

Model Performance Comparison and Final Model Selection¶

In [105]:
# load all performance results into a dataframe for comparison
performance_comparison = pd.DataFrame({
    'Model': ['Basic CNN RGB', 'VGG16 RGB', 'VGG16 FFNN RGB', 'VGG16 FFNN DA RGB'],
    'Train Accuracy': [
        performance_train_basic_cnn['Accuracy'].values[0],
        performance_train_basic_vgg['Accuracy'].values[0],
        performance_train_vgg16_ffnn['Accuracy'].values[0],
        performance_train_vgg16_ffnn_da['Accuracy'].values[0]
    ],
    'Val Accuracy': [
        performance_val_basic_cnn['Accuracy'].values[0],
        performance_val_basic_vgg['Accuracy'].values[0],
        performance_val_vgg16_ffnn['Accuracy'].values[0],
        performance_val_vgg16_ffnn_da['Accuracy'].values[0]
    ],
    'Test Accuracy': [
        performance_test_basic_cnn['Accuracy'].values[0],
        performance_test_basic_vgg['Accuracy'].values[0],
        performance_test_vgg16_ffnn['Accuracy'].values[0],
        performance_test_vgg16_ffnn_da['Accuracy'].values[0]
    ]
})
In [106]:
# display the performance comparison
print("Performance Comparison of Different Models:")
print(performance_comparison)
Performance Comparison of Different Models:
               Model  Train Accuracy  Val Accuracy  Test Accuracy
0      Basic CNN RGB             1.0      0.992063       0.992126
1          VGG16 RGB             1.0      1.000000       1.000000
2     VGG16 FFNN RGB             1.0      1.000000       0.992126
3  VGG16 FFNN DA RGB             1.0      1.000000       1.000000

Test Performance¶

In [107]:
testPerformances = pd.concat([
    performance_test_basic_cnn.T,
    performance_test_basic_cnn_gray.T,
    performance_test_basic_vgg.T,
    performance_test_vgg16_ffnn.T,
    performance_test_vgg16_ffnn_da.T
], axis=1)   

testPerformances.columns = ['Basic CNN RGB', 'Basic CNN Gray','VGG16 RGB', 'VGG16 FFNN RGB', 'VGG16 FFNN DA RGB']
print(testPerformances)
           Basic CNN RGB  Basic CNN Gray  VGG16 RGB  VGG16 FFNN RGB  \
Accuracy        0.992126        0.968504        1.0        0.992126   
Recall          0.992126        0.968504        1.0        0.992126   
Precision       0.992249        0.968962        1.0        0.992249   
F1 Score        0.992126        0.968492        1.0        0.992126   

           VGG16 FFNN DA RGB  
Accuracy                 1.0  
Recall                   1.0  
Precision                1.0  
F1 Score                 1.0  

Actionable Insights & Recommendations¶

Recommendations¶

  • Deployment: The VGG model with FFNN (feed forward neural networks) and Data Augmentation VGG16 FFNN DA RGB is suggested for the deployment as we did a trasnfer learning on a pretrained model with data Augmentation.

  • option-2 Deployment: For this use case the base CNN model also performed well, in case of limited computing resources the base CNN model also works well as it is very small and can be run on CPU.

  • pretrained model is suggested because in future if the realtime images change, the learned model can atleast adapt to the situations

Actionable Insights¶

  • We need to collect more images preferbly negative cases of real workers in the field without helmets instead of closeup images and expand the data set to retrain the model with more images.

Other Observations worth to take a note and discuss¶

  • High Model Performance: The VGG-16 based models achieved perfect or near-perfect accuracy on the test set. This indicates that the models are highly effective for the given dataset. The features learned by VGG-16 on ImageNet are highly transferable to this problem.
  • Transfer Learning: The pre-trained VGG-16 model, even without fine-tuning, performed exceptionally well. This highlights the power of transfer learning for computer vision tasks, especially when the dataset is small.
  • Data Quality: The dataset is small, hence all models performed well, including the base CNN without any pretrained model inclusion. The non-helmet images are closeups thats one of the reason the model learned faster
  • Data Augmentation: While data augmentation is a standard practice to prevent overfitting and improve generalization, the model without data augmentation already performed perfectly. With early stopping, the augmented data model training stopped very early. This suggests the original dataset might be relatively easy for the model to learn.

Power Ahead!